Skip to content

fix: handle empty JSON-LD framing responses and prevent log injection#40

Merged
wiresio merged 6 commits intoeclipse-thingweb:mainfrom
kaiprodevops:fix/uncaught-jsondecodeerror-502
Apr 15, 2026
Merged

fix: handle empty JSON-LD framing responses and prevent log injection#40
wiresio merged 6 commits intoeclipse-thingweb:mainfrom
kaiprodevops:fix/uncaught-jsondecodeerror-502

Conversation

@kaiprodevops
Copy link
Copy Markdown
Contributor

Summary

Fixes uncaught JSONDecodeError when the JSON-LD framing process fails or returns empty output, and prevents log injection vulnerabilities in error handling.

Problem

When frame-jsonld.js returns an empty string (e.g., due to unreachable custom context URLs or malformed N-Triples), the call to json.loads(jsonld_compacted) in frame_td_nt_content() raises an uncaught json.decoder.JSONDecodeError. Flask's default error handler converts this to an unstructured HTTP 500 Internal Server Error with no meaningful message.

This affects any GET /things/{id} request for a Thing Description whose custom context URL was unreachable at ingestion or framing time.

Root Cause

  1. No validation of subprocess output before JSON parsing
  2. No exit code checking for the Node.js framing process
  3. Potential for log injection via unsanitized stderr output
  4. Information leakage through detailed error messages in HTTP responses
  5. Risk of pipe buffer deadlock using write()/read() pattern

Solution

1. Improved Subprocess Communication (tdd/common.py)

  • Use communicate() instead of write()/read() to prevent OS pipe buffer deadlocks (Python official recommendation)
  • Check subprocess exit code (returncode) for reliable error detection
  • Sanitize stderr output using repr() and truncate to 200 chars to prevent log injection attacks
  • Return generic error messages to clients while logging detailed errors server-side

2. Enhanced Error Handling (tdd/td.py)

  • Validate framing output before JSON parsing (check for empty/whitespace-only strings)
  • Catch JSONDecodeError and convert to application-specific exception
  • Log sanitized error details without exposing them to clients

3. JavaScript Error Handling (js-src/frame-jsonld.js)

  • Add try-catch block around framing logic
  • Output errors to stderr for proper logging
  • Exit with non-zero status code (1) on failure

4. New Exception Type (tdd/errors.py)

  • Introduce ExternalDependencyError with HTTP 502 status code
  • Semantically correct for upstream dependency failures (context URLs, Node.js process)
  • Supports multilingual error messages (EN, FR, DE)

Security Improvements

Log Injection Prevention

Before (vulnerable):

logging.error(f"frame-jsonld.js failed: {error_message}")  # Newlines can inject fake log entries

After (secure):

sanitized_stderr = repr(stderr_data.strip()[:200])
logging.error(f"frame-jsonld.js failed with exit code {p.returncode}: {sanitized_stderr}")

Information Leakage Prevention

Before (exposes internals):

{
  "detail": "Framing process failed: Error: Cannot resolve context https://attacker.com/probe",
  "status": 502
}

After (generic):

{
  "title": "External Dependency Error",
  "detail": "An upstream dependency failed to process the data. JSON-LD framing process failed.",
  "status": 502
}

Changes

Modified Files

  • tdd/errors.py - Add ExternalDependencyError exception class
  • tdd/common.py - Improve frame_nt_content() with secure subprocess handling
  • tdd/td.py - Improve frame_td_nt_content() with input validation and error handling
  • js-src/frame-jsonld.js - Add error handling and non-zero exit codes

New Files

  • tdd/tests/test_frame_error_handling.py - Comprehensive test suite (6 test cases)

Testing

Test Coverage

  • ✅ Subprocess failure with non-zero exit code
  • ✅ Empty response from framing engine
  • ✅ Malformed JSON output
  • ✅ Valid response processing
  • ✅ Subprocess success scenario
  • ✅ Log injection attack prevention

Test Results

85 passed, 680 warnings in 10.65s

All existing tests pass, confirming backward compatibility.

Related Issues

Fixes #39

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>
@wiresio
Copy link
Copy Markdown
Member

wiresio commented Apr 10, 2026

Thanks @kaiprodevops for tackling this issue. Here is Claude's feedback:

PR Review: fix/uncaught-jsondecodeerror-502

✅ What's Correct

  1. communicate() replaces write()/read() — correct. The original pattern can deadlock when the subprocess output fills the OS pipe buffer before p.wait() is called. Using communicate() is the documented Python idiom.

  2. Exit code checking — correct. The Node.js try-catch now ensures non-zero exit on any failure, and common.py checks returncode != 0.

  3. ExternalDependencyError integrates automaticallyExternalDependencyError inherits from AppException, so the existing @app.errorhandler(AppException) in init.py handles it without any changes needed there. The 502 status code is semantically appropriate.

  4. Defense-in-depth layering — the empty-string check and JSONDecodeError catch in td.py are valid secondary guards for the case where common.py doesn't raise (e.g., returncode 0 but empty stdout).

  5. JS fix is correct — the try-catch handles both synchronous errors (bad JSON input) and async rejections (framing fails). process.exit(1) ensures a non-zero exit code is propagated.


❌ Bugs / Deficiencies

1. Tests 1, 5, 6 fail without built webpack artifacts (real bug)

Tests that mock subprocess.Popen fail because resources.path("tdd.lib", "frame-jsonld.js") is called before Popen and throws FileNotFoundError when tdd/lib/frame-jsonld.js doesn't exist (it must be built with npm run build). Confirmed locally:

FAILED test_frame_nt_content_subprocess_failure
FAILED test_frame_nt_content_subprocess_success
FAILED test_frame_nt_content_log_injection_prevention

FileNotFoundError: [Errno 2] No such file or directory: '.../tdd/lib/frame-jsonld.js'

The tests need to also mock resources.path in tdd.common, e.g.:

@patch("tdd.common.resources.path")
@patch("tdd.common.subprocess.Popen")
def test_frame_nt_content_subprocess_failure(mock_popen, mock_resources_path):
    mock_resources_path.return_value.__enter__ = lambda s: "fake/path"
    mock_resources_path.return_value.__exit__ = Mock(return_value=False)
    ...

Note: The existing test_GET_thing_OK has the same environment dependency, so this is partially pre-existing — but the PR introduces three new tests with the same flaw.

2. td_id not sanitized in td.py log message (inconsistency)

# tdd/td.py
logging.error(f"JSON decode failed for TD {td_id}: {repr(str(exc)[:100])}")

td_id comes from the URL path and could contain newlines, injecting fake log entries — the same attack the PR claims to prevent in common.py. Should be repr(str(td_id)[:200]).


⚠️ Minor Issues

3. Test 6 positive assertion is trivially true

assert "\\n" in logged_message or "'" in logged_message

repr() always wraps strings in single quotes, so "'" in logged_message is always True. The or makes the whole assertion trivially pass even if \n wasn't escaped. The meaningful assertion is the one that follows:

assert "\nFAKE LOG ENTRY" not in logged_message  # this is fine

The positive assertion should be strengthened to just assert "\\n" in logged_message.

4. details appears in English inside FR/DE messages

message_fr=f"Une dépendance en amont n'a pas pu traiter les données. {details}",

details (e.g., "JSON-LD framing process failed.") is always English, appearing verbatim in French and German responses. Minor UX inconsistency.

5. finally { rl.close() } misleads about error-path behavior

In Node.js, process.exit(1) immediately terminates the process and does not execute the finally block. So rl.close() is only reachable in the success path — the same behavior as the original placement. This is not a bug (exiting the process anyway), but is potentially confusing.


🔍 Missing Test Coverage

No integration test verifies the full HTTP lifecycle: that when framing fails, the endpoint returns:

  • Status 502
  • Content-Type: application/problem+json
  • {"title": "External Dependency Error", "status": 502, ...}

The existing test infrastructure in conftest.py (with test_client and httpx_mock) supports writing such a test.


Summary

Area Verdict
Core deadlock fix (communicate()) ✅ Correct
Exit code checking ✅ Correct
Flask error handler integration ✅ No changes needed
JS try-catch + process.exit(1) ✅ Correct
Log injection prevention in common.py ✅ Correct
Tests 1/5/6 without built assets ❌ Broken
td_id sanitization in log ❌ Missing
Test 6 assertion strength ⚠️ Trivially true
FR/DE details translation ⚠️ English only
HTTP 502 integration test ⚠️ Missing

@wiresio
Copy link
Copy Markdown
Member

wiresio commented Apr 10, 2026

How to fix the tests:

npm run build would produce exactly tdd/lib/frame-jsonld.js (confirmed from webpack config: output goes to tdd/lib/[name].js). The failing tests would then pass, because resources.path("tdd.lib", "frame-jsonld.js") would resolve successfully and the Popen mock would be reached.

Two changes would be needed:


1. Run build before tests — tox.ini

Add a commands_pre directive to the [testenv] that runs the JS build before pytest:

 [testenv]
 deps =
     pytest
 extras = dev
+commands_pre =
+  npm install --prefer-offline
+  npm run build
 commands =
   {envpython} -m pytest {posargs:tdd/tests}

--prefer-offline avoids re-downloading packages if node_modules is already present. If CI always starts clean, use plain npm install.


2. (Optional but cleaner) Add a pytest session-scoped fixture that guards the build — conftest.py

Alternatively, add a session fixture that checks for the built file and gives a clear error if the build is missing, rather than a cryptic FileNotFoundError at test collection time:

# in tdd/tests/conftest.py
import subprocess
from pathlib import Path

@pytest.fixture(scope="session", autouse=True)
def ensure_js_build():
    lib_path = Path(__file__).parent.parent / "lib" / "frame-jsonld.js"
    if not lib_path.exists():
        pytest.exit(
            "tdd/lib/frame-jsonld.js not found. Run 'npm install && npm run build' first.",
            returncode=1,
        )

Which approach to pick

Approach Pros Cons
tox.ini commands_pre Fully automated, CI-safe Requires Node/npm in the test environment
conftest.py guard Clear error message for dev Doesn't fix the issue, just reports it better

Both together is the complete solution: tox.ini ensures the build runs automatically, and the conftest.py guard gives a human-readable error if someone runs pytest directly without the tox wrapper.

The tox.ini change is the only strictly necessary change to make Tests 1/5/6 pass in CI without switching to mocking resources.path.

Signed-off-by: kaiprodev <kaiprodevops@gmail.com>
Signed-off-by: kaiprodev <kaiprodevops@gmail.com>
Signed-off-by: kaiprodev <kaiprodevops@gmail.com>
Signed-off-by: kaiprodev <kaiprodevops@gmail.com>
Signed-off-by: kaiprodev <kaiprodevops@gmail.com>
@kaiprodevops
Copy link
Copy Markdown
Contributor Author

Thanks @wiresio for the thorough review and detailed feedback! I've addressed all the issues you identified. Please review the changes and let me know if any adjustments are needed.

@wiresio wiresio merged commit 3ca1b87 into eclipse-thingweb:main Apr 15, 2026
3 checks passed
@wiresio
Copy link
Copy Markdown
Member

wiresio commented Apr 15, 2026

Thanks @kaiprodevops!

@kaiprodevops
Copy link
Copy Markdown
Contributor Author

You're very welcome! @wiresio Thanks again for the excellent guidance during the code review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unhandled JSONDecodeError when JSON-LD framing returns empty output

2 participants